Overview

Dataset statistics

Number of variables: 11
Number of observations: 8159536
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 74091
Duplicate rows (%): 0.9%
Total size in memory: 684.8 MiB
Average record size in memory: 88.0 B
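
The derived figures in this table follow from the raw counts. The short check below recomputes the duplicate percentage and the total memory size from the numbers reported above; it assumes the total is simply the row count multiplied by the average record size, which is an interpretation rather than something the report states.

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 8_159_536
n_duplicates = 74_091
avg_record_bytes = 88.0

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicates / n_rows

# Assumed relationship: total size = rows * average record size (1 MiB = 2**20 bytes).
total_mib = n_rows * avg_record_bytes / 2**20

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```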

Variable types

Text: 3
DateTime: 1
Numeric: 6
Categorical: 1

Alerts

Duplicates: Dataset has 74091 (0.9%) duplicate rows
High correlation: category_id is highly correlated overall with department_id and 2 other fields
High correlation: department_id is highly correlated overall with category_id and 2 other fields
High correlation: parent_id is highly correlated overall with category_id and 2 other fields
High correlation: salesperson_id is highly correlated overall with category_id and 2 other fields
Skewed: quantity is highly skewed (γ1 = 41.78601045)
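
The skewness alert refers to the third standardized moment of quantity. A minimal sketch of the uncorrected (population) form follows. The toy sample is made up for illustration, and the profiling tool may apply a bias correction, so this is not necessarily the exact estimator behind the reported value.

```python
def population_skewness(values):
    """Third standardized moment, m3 / m2**1.5, without bias correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical toy sample: mostly 1s with a single large value (long right tail).
toy = [1, 1, 1, 1, 10]
skew = population_skewness(toy)
print(round(skew, 3))
```

A positive result indicates a long right tail, which matches the shape of quantity: most rows equal 1, with a few values far above it.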

Reproduction

Analysis started: 2023-12-18 09:22:57.536218
Analysis finished: 2023-12-18 09:29:03.326039
Duration: 6 minutes and 5.79 seconds
Software version: ydata-profiling v4.6.3
Download configuration: config.json
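
The reported duration can be recomputed from the two timestamps above. This is only an arithmetic check using the standard library.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-12-18 09:22:57.536218")
finished = datetime.fromisoformat("2023-12-18 09:29:03.326039")

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```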

Variables

transaction_id
Text

Distinct: 3779936
Distinct (%): 46.3%
Missing: 0
Missing (%): 0.0%
Memory size: 62.3 MiB

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 261105152
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
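
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard unicodedata module. The input string is the first sample value of this variable. The standard module exposes categories but not scripts or blocks, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter

# First sample value of this variable (a 32-character hexadecimal string).
sample = "2e18343f9b9a95e89587273536e59d6e"

# Two-letter general category codes: "Nd" = Decimal Number,
# "Ll" = Lowercase Letter, "Pd" = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in sample)
print(dict(categories))

# A value such as "-1" adds the Dash Punctuation category.
dash_categories = {unicodedata.category(ch) for ch in "-1"}
print(sorted(dash_categories))
```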

Unique

Unique: 1938335
Unique (%): 23.8%
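
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. A small sketch using the five sample rows of this variable:

```python
from collections import Counter

# The five sample rows listed for this variable.
values = [
    "2e18343f9b9a95e89587273536e59d6e",
    "d53096e90b515b563631b18acfa4d364",
    "d53096e90b515b563631b18acfa4d364",
    "2c2296658e7f9ae94954b5836214de76",
    "2c2296658e7f9ae94954b5836214de76",
]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
print(n_distinct, n_unique)
```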

Sample

1st row: 2e18343f9b9a95e89587273536e59d6e
2nd row: d53096e90b515b563631b18acfa4d364
3rd row: d53096e90b515b563631b18acfa4d364
4th row: 2c2296658e7f9ae94954b5836214de76
5th row: 2c2296658e7f9ae94954b5836214de76

Value                             Count    Frequency (%)
5ad24284bf48eee984a16d0f94a380c4  228      < 0.1%
9e7effb23dd7557287e6672d57cce251  205      < 0.1%
c8ba6c7971f047170993d652aded77c9  189      < 0.1%
fe261cabab8a92b4e56ab2abbf9440ba  187      < 0.1%
58ece7e4171e4020f7f3774797f411fc  171      < 0.1%
198fab4051313b11b79e399d0ced77fa  166      < 0.1%
5d4be88addad9ae301fcbb7a917ed6dc  144      < 0.1%
a0527cd9ba6579ec4fa00494fd15a875  138      < 0.1%
98a033d8a20192765b4425c95ce3d8de  138      < 0.1%
7c04514f81c60e2eb7c37c76ed5ebdf6  134      < 0.1%
Other values (3779926)            8157836  > 99.9%

Most occurring characters

Value             Count      Frequency (%)
8                 16332932   6.3%
7                 16332859   6.3%
9                 16331253   6.3%
c                 16329897   6.3%
4                 16329878   6.3%
e                 16325988   6.3%
2                 16320689   6.3%
6                 16319353   6.3%
0                 16315773   6.2%
5                 16314318   6.2%
Other values (6)  97852212   37.5%

Most occurring categories

Value             Count      Frequency (%)
Decimal Number    163212499  62.5%
Lowercase Letter  97892653   37.5%

Most frequent character per category

Decimal Number
Value  Count     Frequency (%)
8      16332932  10.0%
7      16332859  10.0%
9      16331253  10.0%
4      16329878  10.0%
2      16320689  10.0%
6      16319353  10.0%
0      16315773  10.0%
5      16314318  10.0%
3      16308116  10.0%
1      16307328  10.0%

Lowercase Letter
Value  Count     Frequency (%)
c      16329897  16.7%
e      16325988  16.7%
a      16312954  16.7%
d      16311965  16.7%
f      16306584  16.7%
b      16305265  16.7%

Most occurring scripts

Value   Count      Frequency (%)
Common  163212499  62.5%
Latin   97892653   37.5%

Most frequent character per script

Common
Value  Count     Frequency (%)
8      16332932  10.0%
7      16332859  10.0%
9      16331253  10.0%
4      16329878  10.0%
2      16320689  10.0%
6      16319353  10.0%
0      16315773  10.0%
5      16314318  10.0%
3      16308116  10.0%
1      16307328  10.0%

Latin
Value  Count     Frequency (%)
c      16329897  16.7%
e      16325988  16.7%
a      16312954  16.7%
d      16311965  16.7%
f      16306584  16.7%
b      16305265  16.7%

Most occurring blocks

Value  Count      Frequency (%)
ASCII  261105152  100.0%

Most frequent character per block

ASCII
Value             Count     Frequency (%)
8                 16332932  6.3%
7                 16332859  6.3%
9                 16331253  6.3%
c                 16329897  6.3%
4                 16329878  6.3%
e                 16325988  6.3%
2                 16320689  6.3%
6                 16319353  6.3%
0                 16315773  6.2%
5                 16314318  6.2%
Other values (6)  97852212  37.5%

sales_datetime
DateTime

Distinct: 1256305
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 62.3 MiB
Minimum: 2011-01-01 09:04:00
Maximum: 2014-10-01 21:00:00
Histogram with fixed size bins (bins=50)

customer_id
Text

Distinct: 204442
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 62.3 MiB

Length

Max length: 32
Median length: 2
Mean length: 15.403575
Min length: 2

Characters and Unicode

Total characters: 125686022
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26913
Unique (%): 0.3%

Sample

1st row: -1
2nd row: -1
3rd row: -1
4th row: -1
5th row: -1

Value                             Count    Frequency (%)
1                                 4513971  55.3%
4acc7bfe5080965a63f05e1c3852a4ad  8583     0.1%
ff202c04cfc8394a5e43a11012e11d93  8577     0.1%
2593d7b4a54b8a3fd01144d17f1949a5  8573     0.1%
3ff2a371691a832d9ba1d51fdaeac07b  8547     0.1%
712e4c5edc1e1e67ea67061f78678612  8533     0.1%
5469d95f04a89400f2491ea0a9653dee  8494     0.1%
4f59ff2f20f2ad74aae3eaf596c8f978  8487     0.1%
d34e761991ca5b798972ec7e328253be  8484     0.1%
f17ece610fb7cc5935a92f68fe387e0e  8481     0.1%
Other values (204432)             3568806  43.7%

Most occurring characters

Value             Count     Frequency (%)
1                 11986309  9.5%
a                 7465781   5.9%
8                 7446464   5.9%
5                 7411674   5.9%
f                 7392447   5.9%
d                 7344098   5.8%
c                 7300016   5.8%
3                 7284894   5.8%
9                 7248636   5.8%
0                 7247263   5.8%
Other values (7)  47558440  37.8%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    77413991  61.6%
Lowercase Letter  43758060  34.8%
Dash Punctuation  4513971   3.6%

Most frequent character per category

Decimal Number
Value  Count     Frequency (%)
1      11986309  15.5%
8      7446464   9.6%
5      7411674   9.6%
3      7284894   9.4%
9      7248636   9.4%
0      7247263   9.4%
2      7244423   9.4%
4      7209903   9.3%
6      7175653   9.3%
7      7158772   9.2%

Lowercase Letter
Value  Count    Frequency (%)
a      7465781  17.1%
f      7392447  16.9%
d      7344098  16.8%
c      7300016  16.7%
e      7246027  16.6%
b      7009691  16.0%

Dash Punctuation
Value  Count    Frequency (%)
-      4513971  100.0%

Most occurring scripts

Value   Count     Frequency (%)
Common  81927962  65.2%
Latin   43758060  34.8%

Most frequent character per script

Common
Value  Count     Frequency (%)
1      11986309  14.6%
8      7446464   9.1%
5      7411674   9.0%
3      7284894   8.9%
9      7248636   8.8%
0      7247263   8.8%
2      7244423   8.8%
4      7209903   8.8%
6      7175653   8.8%
7      7158772   8.7%

Latin
Value  Count    Frequency (%)
a      7465781  17.1%
f      7392447  16.9%
d      7344098  16.8%
c      7300016  16.7%
e      7246027  16.6%
b      7009691  16.0%

Most occurring blocks

Value  Count      Frequency (%)
ASCII  125686022  100.0%

Most frequent character per block

ASCII
Value             Count     Frequency (%)
1                 11986309  9.5%
a                 7465781   5.9%
8                 7446464   5.9%
5                 7411674   5.9%
f                 7392447   5.9%
d                 7344098   5.8%
c                 7300016   5.8%
3                 7284894   5.8%
9                 7248636   5.8%
0                 7247263   5.8%
Other values (7)  47558440  37.8%

product_id
Text

Distinct: 94102
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 62.3 MiB

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 261105152
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10979
Unique (%): 0.1%

Sample

1st row: 006ae35b6f0aae363ff038ffd44ad049
2nd row: 780024e0152928b310df607663294dd4
3rd row: 22981f293030ae132845164a0ba728e4
4th row: 6c3d6e8b176b711d066df803470efbaa
5th row: 7916bb6025e4b42bbdbf48e1faacad90

Value                             Count    Frequency (%)
cd784f9c80112f207a06e23328ce0edb  85140    1.0%
4eb0ed65f0f049a72e10c2949d09183c  51342    0.6%
ed92ef3044e2b4c0978a26b787f08fed  43302    0.5%
61c161228c0bf8c5ce7ddb6c94465971  33152    0.4%
c6f818a8a9fd204094af38b1b8df93e4  30456    0.4%
07eaaef37bcf9fe8f8332ce94dc64652  30057    0.4%
5f616d6c685f8907deaf6778821ab3d8  29930    0.4%
cbbcf199e72782c1e61166d3e0197603  27626    0.3%
5cd8b51aee5f4501dd649e5b11f6ca8f  22644    0.3%
b737da1bfe91eca560c62a26b8478aaf  21536    0.3%
Other values (94092)              7784351  95.4%

Most occurring characters

Value             Count     Frequency (%)
1                 16962593  6.5%
0                 16709056  6.4%
8                 16523573  6.3%
4                 16518134  6.3%
6                 16389893  6.3%
2                 16356584  6.3%
3                 16340057  6.3%
7                 16325711  6.3%
b                 16311496  6.2%
9                 16215004  6.2%
Other values (6)  96453051  36.9%

Most occurring categories

Value             Count      Frequency (%)
Decimal Number    164272803  62.9%
Lowercase Letter  96832349   37.1%

Most frequent character per category

Decimal Number
Value  Count     Frequency (%)
1      16962593  10.3%
0      16709056  10.2%
8      16523573  10.1%
4      16518134  10.1%
6      16389893  10.0%
2      16356584  10.0%
3      16340057  9.9%
7      16325711  9.9%
9      16215004  9.9%
5      15932198  9.7%

Lowercase Letter
Value  Count     Frequency (%)
b      16311496  16.8%
f      16186917  16.7%
c      16138763  16.7%
e      16091419  16.6%
a      16077959  16.6%
d      16025795  16.6%

Most occurring scripts

Value   Count      Frequency (%)
Common  164272803  62.9%
Latin   96832349   37.1%

Most frequent character per script

Common
Value  Count     Frequency (%)
1      16962593  10.3%
0      16709056  10.2%
8      16523573  10.1%
4      16518134  10.1%
6      16389893  10.0%
2      16356584  10.0%
3      16340057  9.9%
7      16325711  9.9%
9      16215004  9.9%
5      15932198  9.7%

Latin
Value  Count     Frequency (%)
b      16311496  16.8%
f      16186917  16.7%
c      16138763  16.7%
e      16091419  16.6%
a      16077959  16.6%
d      16025795  16.6%

Most occurring blocks

Value  Count      Frequency (%)
ASCII  261105152  100.0%

Most frequent character per block

ASCII
Value             Count     Frequency (%)
1                 16962593  6.5%
0                 16709056  6.4%
8                 16523573  6.3%
4                 16518134  6.3%
6                 16389893  6.3%
2                 16356584  6.3%
3                 16340057  6.3%
7                 16325711  6.3%
b                 16311496  6.2%
9                 16215004  6.2%
Other values (6)  96453051  36.9%

quantity
Real number (ℝ)

SKEWED 

Distinct: 18825
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.018986
Minimum: -70
Maximum: 189
Zeros: 6001
Zeros (%): 0.1%
Negative: 1311
Negative (%): < 0.1%
Memory size: 62.3 MiB

Quantile statistics

Minimum: -70
5-th percentile: 0.2865
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 189
Range: 259
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.60034183
Coefficient of variation (CV): 0.58915613
Kurtosis: 6292.098
Mean: 1.018986
Median Absolute Deviation (MAD): 0
Skewness: 41.78601
Sum: 8314452.7
Variance: 0.36041031
Monotonicity: Not monotonic
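
Several of the quantity statistics are simple functions of the others. The check below recomputes the range, the interquartile range, the coefficient of variation and the variance from the reported values. Small differences in the last digits are expected, because the inputs are themselves rounded.

```python
# Values copied from the quantile and descriptive statistics for quantity.
minimum, maximum = -70, 189
q1, q3 = 1, 1
mean = 1.018986
std = 0.60034183

value_range = maximum - minimum  # reported Range: 259
iqr = q3 - q1                    # reported IQR: 0
cv = std / mean                  # reported CV: 0.58915613
variance = std ** 2              # reported Variance: 0.36041031

print(value_range, iqr, round(cv, 6), round(variance, 6))
```

An interquartile range of 0 alongside a range of 259 is consistent with the skewness alert: at least half of the rows share one value, and the spread comes from a small number of extreme values.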
Histogram with fixed size bins (bins=50)
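
A fixed-size-bin histogram splits the interval between the minimum and the maximum into equally wide bins and counts the values in each. The sketch below shows the idea on a made-up list. The function name and the handling of the maximum value are illustrative choices, not the profiling tool's implementation.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals between min and max."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate case: every value is identical, so use the first bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum would land in bin index `bins`; clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

# Hypothetical data: the integers 0..10 in 5 bins of width 2.
counts = fixed_width_histogram([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], bins=5)
print(counts)  # [2, 2, 2, 2, 3]
```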

Common Values

Value                 Count    Frequency (%)
1                     6904507  84.6%
2                     329533   4.0%
3                     53435    0.7%
0.5                   30225    0.4%
4                     29284    0.4%
5                     7667     0.1%
6                     6247     0.1%
0                     6001     0.1%
0.2                   4651     0.1%
0.1                   3266     < 0.1%
Other values (18815)  784720   9.6%

Minimum 10 values

Value    Count  Frequency (%)
-70      1      < 0.1%
-20      2      < 0.1%
-15      1      < 0.1%
-10      1      < 0.1%
-9.712   1      < 0.1%
-9       1      < 0.1%
-8       1      < 0.1%
-7       2      < 0.1%
-6.791   4      < 0.1%
-6.4444  1      < 0.1%

Maximum 10 values

Value     Count  Frequency (%)
189       1      < 0.1%
160       2      < 0.1%
156       1      < 0.1%
120       3      < 0.1%
109       1      < 0.1%
107.3778  2      < 0.1%
96        1      < 0.1%
91        1      < 0.1%
90        4      < 0.1%
86        1      < 0.1%

price
Real number (ℝ)

Distinct: 11655
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.686815
Minimum: -39.9
Maximum: 2432
Zeros: 60
Zeros (%): < 0.1%
Negative: 9
Negative (%): < 0.1%
Memory size: 62.3 MiB

Quantile statistics

Minimum: -39.9
5-th percentile: 1.9
Q1: 4.9
median: 9.95
Q3: 16.5
95-th percentile: 39.5
Maximum: 2432
Range: 2471.9
Interquartile range (IQR): 11.6

Descriptive statistics

Standard deviation: 16.202048
Coefficient of variation (CV): 1.1837705
Kurtosis: 159.54173
Mean: 13.686815
Median Absolute Deviation (MAD): 5.35
Skewness: 5.8110218
Sum: 1.1167806 × 10^8
Variance: 262.50636
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
5                     237992   2.9%
7                     174905   2.1%
10                    174693   2.1%
15                    115466   1.4%
12                    113503   1.4%
8                     112018   1.4%
4.5                   111117   1.4%
3.4                   110601   1.4%
4                     100250   1.2%
13.5                  83216    1.0%
Other values (11645)  6825775  83.7%

Minimum 10 values

Value   Count  Frequency (%)
-39.9   1      < 0.1%
-20.8   5      < 0.1%
-11.2   2      < 0.1%
-5.7    1      < 0.1%
0       60     < 0.1%
0.001   7      < 0.1%
0.0011  14     < 0.1%
0.0012  6      < 0.1%
0.0013  10     < 0.1%
0.0014  9      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
2432     1      < 0.1%
1948.72  1      < 0.1%
1752     1      < 0.1%
995      1      < 0.1%
987.56   1      < 0.1%
925.9    2      < 0.1%
925      4      < 0.1%
780      1      < 0.1%
660      1      < 0.1%
600      1      < 0.1%

category_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 398
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 236.08314
Minimum: 1
Maximum: 398
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 135
Q1: 201
median: 231
Q3: 266
95-th percentile: 359
Maximum: 398
Range: 397
Interquartile range (IQR): 65

Descriptive statistics

Standard deviation: 68.375436
Coefficient of variation (CV): 0.2896244
Kurtosis: 1.2882789
Mean: 236.08314
Median Absolute Deviation (MAD): 32
Skewness: -0.30599765
Sum: 1.9263289 × 10^9
Variance: 4675.2003
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count    Frequency (%)
220                 805849   9.9%
222                 339379   4.2%
255                 328081   4.0%
250                 269489   3.3%
235                 217856   2.7%
201                 209272   2.6%
179                 186996   2.3%
232                 177660   2.2%
183                 163019   2.0%
175                 146159   1.8%
Other values (388)  5315776  65.1%

Minimum 10 values

Value  Count  Frequency (%)
1      4      < 0.1%
2      5844   0.1%
3      2291   < 0.1%
4      20     < 0.1%
5      120    < 0.1%
6      34929  0.4%
7      6152   0.1%
8      3      < 0.1%
9      766    < 0.1%
10     2355   < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
398    408    < 0.1%
397    5246   0.1%
396    1188   < 0.1%
395    96     < 0.1%
394    103    < 0.1%
393    2689   < 0.1%
392    7387   0.1%
391    3978   < 0.1%
390    16607  0.2%
389    4443   0.1%

parent_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 59
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.820996
Minimum: 1
Maximum: 59
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 17
Q1: 25
median: 29
Q3: 43
95-th percentile: 59
Maximum: 59
Range: 58
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 12.527912
Coefficient of variation (CV): 0.37041817
Kurtosis: -0.50480476
Mean: 33.820996
Median Absolute Deviation (MAD): 10
Skewness: 0.26821554
Sum: 2.7596363 × 10^8
Variance: 156.94857
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value              Count    Frequency (%)
25                 989119   12.1%
24                 851940   10.4%
29                 765885   9.4%
59                 505856   6.2%
27                 462686   5.7%
42                 408439   5.0%
41                 377014   4.6%
43                 293786   3.6%
19                 284788   3.5%
40                 254015   3.1%
Other values (49)  2966008  36.4%

Minimum 10 values

Value  Count  Frequency (%)
1      4      < 0.1%
2      23218  0.3%
3      15095  0.2%
4      59230  0.7%
5      1706   < 0.1%
6      35382  0.4%
7      7752   0.1%
8      35103  0.4%
9      118    < 0.1%
10     6602   0.1%

Maximum 10 values

Value  Count   Frequency (%)
59     505856  6.2%
58     42266   0.5%
57     85382   1.0%
56     9       < 0.1%
55     400     < 0.1%
54     8004    0.1%
53     55351   0.7%
52     64264   0.8%
51     43146   0.5%
50     143618  1.8%

store_id
Real number (ℝ)

Distinct: 39
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.705304
Minimum: 1
Maximum: 40
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
median: 8
Q3: 18
95-th percentile: 31
Maximum: 40
Range: 39
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 8.9790947
Coefficient of variation (CV): 0.76709624
Kurtosis: 0.17346227
Mean: 11.705304
Median Absolute Deviation (MAD): 5
Skewness: 0.98897472
Sum: 95509850
Variance: 80.624142
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)

Common Values

Value              Count    Frequency (%)
6                  917002   11.2%
7                  834722   10.2%
2                  518525   6.4%
3                  473091   5.8%
5                  448778   5.5%
8                  423558   5.2%
1                  368196   4.5%
22                 336921   4.1%
4                  336227   4.1%
9                  274005   3.4%
Other values (29)  3228511  39.6%

Minimum 10 values

Value  Count   Frequency (%)
1      368196  4.5%
2      518525  6.4%
3      473091  5.8%
4      336227  4.1%
5      448778  5.5%
6      917002  11.2%
7      834722  10.2%
8      423558  5.2%
9      274005  3.4%
10     156648  1.9%

Maximum 10 values

Value  Count  Frequency (%)
40     17254  0.2%
39     20717  0.3%
38     13214  0.2%
36     49680  0.6%
35     97966  1.2%
34     41619  0.5%
33     52337  0.6%
32     65762  0.8%
31     77264  0.9%
30     30425  0.4%

department_id
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 62.3 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8159536
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count    Frequency (%)
2      4151760  50.9%
3      2198053  26.9%
4      1366211  16.7%
1      443512   5.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
2      4151760  50.9%
3      2198053  26.9%
4      1366211  16.7%
1      443512   5.4%

Most occurring characters

Value  Count    Frequency (%)
2      4151760  50.9%
3      2198053  26.9%
4      1366211  16.7%
1      443512   5.4%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  8159536  100.0%

Most frequent character per category

Decimal Number
Value  Count    Frequency (%)
2      4151760  50.9%
3      2198053  26.9%
4      1366211  16.7%
1      443512   5.4%

Most occurring scripts

Value   Count    Frequency (%)
Common  8159536  100.0%

Most frequent character per script

Common
Value  Count    Frequency (%)
2      4151760  50.9%
3      2198053  26.9%
4      1366211  16.7%
1      443512   5.4%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  8159536  100.0%

Most frequent character per block

ASCII
Value  Count    Frequency (%)
2      4151760  50.9%
3      2198053  26.9%
4      1366211  16.7%
1      443512   5.4%

salesperson_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 625
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 322.54161
Minimum: 1
Maximum: 625
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 92
Q1: 165
median: 368
Q3: 427
95-th percentile: 577
Maximum: 625
Range: 624
Interquartile range (IQR): 262

Descriptive statistics

Standard deviation: 156.71991
Coefficient of variation (CV): 0.48589052
Kurtosis: -1.1022361
Mean: 322.54161
Median Absolute Deviation (MAD): 116
Skewness: -0.1685921
Sum: 2.6317899 × 10^9
Variance: 24561.129
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count    Frequency (%)
108                 787163   9.6%
417                 763096   9.4%
428                 206131   2.5%
214                 185266   2.3%
427                 123884   1.5%
461                 105869   1.3%
584                 99526    1.2%
446                 84389    1.0%
409                 83519    1.0%
511                 80368    1.0%
Other values (615)  5640325  69.1%

Minimum 10 values

Value  Count  Frequency (%)
1      19     < 0.1%
2      998    < 0.1%
3      3988   < 0.1%
4      4417   0.1%
5      1002   < 0.1%
6      1290   < 0.1%
7      1385   < 0.1%
8      2756   < 0.1%
9      3301   < 0.1%
10     4081   0.1%

Maximum 10 values

Value  Count  Frequency (%)
625    345    < 0.1%
624    5083   0.1%
623    4376   0.1%
622    6323   0.1%
621    10212  0.1%
620    10912  0.1%
619    3779   < 0.1%
618    1217   < 0.1%
617    114    < 0.1%
616    4521   0.1%

Interactions

[Figure: 36 pairwise interaction plots]

Correlations

                category_id  department_id  parent_id  price   quantity  salesperson_id  store_id
category_id     1.000        0.934          0.836      -0.139  -0.148    0.848           -0.227
department_id   0.934        1.000          0.921      -0.142  -0.178    0.919           -0.241
parent_id       0.836        0.921          1.000      -0.101  -0.130    0.846           -0.228
price           -0.139       -0.142         -0.101     1.000   -0.217    -0.128          0.094
quantity        -0.148       -0.178         -0.130     -0.217  1.000     -0.162          0.115
salesperson_id  0.848        0.919          0.846      -0.128  -0.162    1.000           -0.222
store_id        -0.227       -0.241         -0.228     0.094   0.115     -0.222          1.000
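
The four "High correlation" alerts can be read directly off this matrix. The sketch below lists the variable pairs whose absolute coefficient exceeds a cut-off. The value 0.8 is a hypothetical threshold chosen for illustration, since the report does not state which cut-off or correlation measure produced the alerts.

```python
from itertools import combinations

names = ["category_id", "department_id", "parent_id", "price",
         "quantity", "salesperson_id", "store_id"]

# Coefficients copied from the correlation table.
matrix = [
    [1.000, 0.934, 0.836, -0.139, -0.148, 0.848, -0.227],
    [0.934, 1.000, 0.921, -0.142, -0.178, 0.919, -0.241],
    [0.836, 0.921, 1.000, -0.101, -0.130, 0.846, -0.228],
    [-0.139, -0.142, -0.101, 1.000, -0.217, -0.128, 0.094],
    [-0.148, -0.178, -0.130, -0.217, 1.000, -0.162, 0.115],
    [0.848, 0.919, 0.846, -0.128, -0.162, 1.000, -0.222],
    [-0.227, -0.241, -0.228, 0.094, 0.115, -0.222, 1.000],
]

THRESHOLD = 0.8  # hypothetical cut-off, not taken from the report

# Collect every pair above the threshold (upper triangle only).
strong = [
    (a, b, matrix[i][j])
    for (i, a), (j, b) in combinations(enumerate(names), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

# For each variable, the set of variables it is strongly correlated with.
partners = {}
for a, b, _ in strong:
    partners.setdefault(a, set()).add(b)
    partners.setdefault(b, set()).add(a)

for name in sorted(partners):
    print(name, "->", sorted(partners[name]))
```

With this cut-off, each of category_id, department_id, parent_id and salesperson_id is paired with the other three, which matches the "and 2 other fields" wording of the alerts.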

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
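
Both views are built from the same per-cell test for missing values. The sketch below computes a per-column missing count and a row-by-column nullity matrix for a tiny made-up table. The rows and the None markers are hypothetical, since this dataset itself has no missing cells.

```python
# Hypothetical rows; None marks a missing cell.
rows = [
    {"quantity": 1.0, "price": 13.5},
    {"quantity": None, "price": 6.5},
    {"quantity": 1.0, "price": None},
]
columns = ["quantity", "price"]

# Count per column: the data behind a "nullity by column" bar chart.
missing_per_column = {c: sum(1 for r in rows if r[c] is None) for c in columns}

# Matrix view: one row per record, True where the cell is missing.
nullity_matrix = [[r[c] is None for c in columns] for r in rows]

print(missing_per_column)
print(nullity_matrix)
```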

Sample

First rows

   transaction_id                    sales_datetime       customer_id  product_id                        quantity  price  category_id  parent_id  store_id  department_id  salesperson_id
0  2e18343f9b9a95e89587273536e59d6e  2011-01-01 09:04:00  -1           006ae35b6f0aae363ff038ffd44ad049  1.0       13.5   208          29         18        2              108
1  d53096e90b515b563631b18acfa4d364  2011-01-01 09:04:00  -1           780024e0152928b310df607663294dd4  1.0       6.5    179          30         17        2              108
2  d53096e90b515b563631b18acfa4d364  2011-01-01 09:04:00  -1           22981f293030ae132845164a0ba728e4  1.0       6.5    179          30         17        2              108
3  2c2296658e7f9ae94954b5836214de76  2011-01-01 09:08:00  -1           6c3d6e8b176b711d066df803470efbaa  1.0       10.5   175          29         1         2              108
4  2c2296658e7f9ae94954b5836214de76  2011-01-01 09:08:00  -1           7916bb6025e4b42bbdbf48e1faacad90  1.0       17.5   208          29         1         2              108
5  d97df679d6c9d97fc9b5f96f2571f004  2011-01-01 09:13:00  -1           11ec26ec94745367e5835a7b2826ff64  1.0       4.5    162          29         1         2              108
6  fce651c756b339e8ccac5016a716abe8  2011-01-01 09:22:00  -1           718d92d49ddbe79e535f2040ad69ba54  1.0       2.8    185          25         1         2              108
7  71a0acdaa0a56543a0f99aad44672b8b  2011-01-01 09:34:00  -1           61c161228c0bf8c5ce7ddb6c94465971  1.0       6.5    175          29         8         2              108
8  0be4cc43ca4d02734b8e3771270bdae2  2011-01-01 09:36:00  -1           ba149eec842286d64af617372fe3d1c9  1.0       5.0    162          29         8         2              108
9  95b5c2ec0badb42dc1e0204a6728ee30  2011-01-01 09:37:00  -1           d3e552cb8ef0db17c0b86b2a991ebc00  1.0       21.3   232          27         17        2              108

Last rows

         transaction_id                    sales_datetime       customer_id                       product_id                        quantity  price  category_id  parent_id  store_id  department_id  salesperson_id
8159526  4c46be3fd071b49472f8f3888a00988f  2014-10-01 20:50:00  393359353d6413f3603a0ac05c3616df  f07582f241ab4ffa2bc611c1f504f23a  1.0       12.00  222          25         22        2              396
8159527  4c46be3fd071b49472f8f3888a00988f  2014-10-01 20:50:00  393359353d6413f3603a0ac05c3616df  351c168ec4e555703ecf6d13b1a22fb9  1.0       3.90   220          24         22        2              396
8159528  cceb4e83f4516f7cb39d2fb6a5638cae  2014-10-01 20:51:00  966a62a203014b33b22c11fcdd8b7f52  54b46942e898acd7b3fb1e0763ad042f  1.0       9.95   158          25         2         2              367
8159529  36089a5459f6b5c228438c5557306d12  2014-10-01 20:54:00  099976a2741063dcc519d99b720d0280  8bb0c3e45ac5722d50d2ccca1ea92099  1.0       5.00   179          30         22        2              396
8159530  36089a5459f6b5c228438c5557306d12  2014-10-01 20:54:00  099976a2741063dcc519d99b720d0280  79e4745ef8f6153f56cf3af3858c4695  1.0       5.00   179          30         22        2              396
8159531  375d02cbac732e7655763366474ddb3c  2014-10-01 20:55:00  -1                                221eece5c0f81c92b7f01b0e2a0573eb  1.0       12.00  222          25         22        2              396
8159532  2052ddaefe02020e1794a963c8e06d8a  2014-10-01 20:56:00  4d0e1076f08aa463fbd18f2553291482  6d2ea0ac8f7b1263cf21171522176976  1.0       12.00  222          25         22        2              396
8159533  2052ddaefe02020e1794a963c8e06d8a  2014-10-01 20:56:00  4d0e1076f08aa463fbd18f2553291482  5285184b43c1bc18d941d2a0ab6123e4  1.0       12.00  161          27         22        2              396
8159534  67d3974eba51b09878bc0ecab84c5555  2014-10-01 20:57:00  -1                                6d2ea0ac8f7b1263cf21171522176976  1.0       12.00  222          25         22        2              396
8159535  fb6dd9d66e769a2e36407b04777b8dbc  2014-10-01 21:00:00  -1                                44501de44aff5f038816bba26f0b46d1  1.0       16.50  180          21         22        2              396

Duplicate rows

Most frequently occurring

       transaction_id                    sales_datetime       customer_id                       product_id                        quantity  price  category_id  parent_id  store_id  department_id  salesperson_id  # duplicates
62496  d8734b9e2874373bfacf25e1c94aa532  2012-05-24 14:27:00  -1                                5d867c72bbb1a8277429673844ddb8e1  1.0       5.00   397          52         15        4              471             37
30026  673defde9a3efdf108d98a98c30b1e4b  2012-11-15 16:40:00  c0d64e6d4768df5200f0e624ab5942c4  cd784f9c80112f207a06e23328ce0edb  1.0       64.95  298          47         12        4              620             36
19970  447092ce5b59d5a3b8e0bdedfb664e5d  2012-10-26 14:58:00  c0d64e6d4768df5200f0e624ab5942c4  cd784f9c80112f207a06e23328ce0edb  1.0       64.95  298          47         12        4              619             28
72023  f8de84f89fd708945aa82b27b8568d34  2013-06-03 13:28:00  -1                                27da5b58a4414e20f99af385ebb76f94  1.0       3.48   352          59         20        4              524             24
72349  f9f6bd7f0d7b1fe2705f04431e55a4d4  2013-06-28 15:33:00  a43d0304223cf0c2129e223b89ab577a  cd784f9c80112f207a06e23328ce0edb  1.0       2.97   298          47         10        4              604             24
5458   12c53a6f6d738458c5ac003dda3932c7  2012-01-04 12:01:00  c0d64e6d4768df5200f0e624ab5942c4  5d867c72bbb1a8277429673844ddb8e1  1.0       10.00  397          52         15        4              463             23
5490   12d8423104f43bc45e7c9c9acc0508d6  2012-01-27 10:56:00  0383d192ffc2a8fbadceef1d50d1d6b6  cd784f9c80112f207a06e23328ce0edb  1.0       4.78   298          47         9         4              601             16
12712  2b32f7d00671f3bc3b4743001dbb5120  2012-07-11 15:08:00  -1                                cd784f9c80112f207a06e23328ce0edb  1.0       2.97   298          47         22        4              523             16
14948  32f9c26e390b5e8c743eccaaa02c9c01  2013-10-04 15:49:00  -1                                cd784f9c80112f207a06e23328ce0edb  1.0       3.47   298          47         10        4              604             16
17455  3bad4843c9f4769e0f0d7b0e54cdd5a3  2012-11-22 16:54:00  ebc1532b5f3654d8aadb91a2c68c808d  cd784f9c80112f207a06e23328ce0edb  1.0       64.95  298          47         12        4              620             16
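
A table like this can be derived by counting identical rows and keeping those that occur more than once. The sketch below does so with the standard library on made-up rows. The tuples and their values are hypothetical, and the way the profiling tool counts duplicates internally may differ.

```python
from collections import Counter

# Hypothetical rows as (transaction_id, sales_datetime, product_id, quantity, price).
rows = [
    ("t1", "2012-05-24 14:27:00", "p1", 1.0, 5.00),
    ("t1", "2012-05-24 14:27:00", "p1", 1.0, 5.00),
    ("t1", "2012-05-24 14:27:00", "p1", 1.0, 5.00),
    ("t2", "2012-11-15 16:40:00", "p2", 1.0, 64.95),
    ("t2", "2012-11-15 16:40:00", "p2", 1.0, 64.95),
    ("t3", "2013-06-03 13:28:00", "p3", 1.0, 3.48),
]

# Count identical rows, keep those seen more than once, most frequent first.
counts = Counter(rows)
duplicates = sorted(
    ((row, n) for row, n in counts.items() if n > 1),
    key=lambda item: item[1],
    reverse=True,
)

for row, n in duplicates:
    print(row[0], n)
```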